New York City is among the most populous cities in the world, with about 8.6 million people (2017). It is also the melting pot of America, with representation from all races.
We came across many articles discussing racial segregation in NYC. https://www.nytimes.com/2019/07/16/nyregion/segregation-nyc-affordable-housing.html describes how the city’s policy of giving preference to local residents for new affordable housing helps perpetuate racial segregation. According to https://www.nytimes.com/2019/03/26/nyregion/school-segregation-new-york.html, NYC public schools are still struggling with racial segregation.
We were curious to look at demographic data to find out the racial distribution of population in New York City to help answer the following questions:
Multiple datasets are available for demographic and race data. We chose to focus on the following datasets:
https://data.census.gov/cedsci/profile?g=0400000US36&q=New%20York
Census data is collected every 10 years (the decennial survey). The last survey was conducted in 2010.
The alternative is the American Community Survey (ACS), which samples the population and is performed yearly. The datasets available from ACS are either yearly or an average of the past 5 years. The 5-year average is considered the most reliable. We chose to use the 2013-2017 5-year average data for our study.
The ACS data contains population numbers for the following races:
To analyze how segregated NYC schools are across zip codes, the datasets below were joined with the demographic dataset to obtain the zip code of each school.
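A join of this kind can be sketched as follows. This is only an illustration, not the code used in the study: the field names (school_id, zip_code, total_enrollment) and all records are hypothetical, and the sketch assumes each school has a single identifier shared by both datasets.

```python
# Hypothetical records; field names and values are illustrative only.
demographics = [
    {"school_id": "S001", "year": 2017, "total_enrollment": 190},
    {"school_id": "S002", "year": 2017, "total_enrollment": 170},
    {"school_id": "S999", "year": 2017, "total_enrollment": 50},
]
locations = [
    {"school_id": "S001", "zip_code": "Z1"},
    {"school_id": "S002", "zip_code": "Z2"},
]


def attach_zip_codes(demographics, locations):
    """Left-join demographic rows to locations on the school identifier."""
    zip_by_school = {row["school_id"]: row["zip_code"] for row in locations}
    joined = []
    for row in demographics:
        # Schools with no known location keep zip_code=None so they can be audited.
        joined.append({**row, "zip_code": zip_by_school.get(row["school_id"])})
    return joined


joined = attach_zip_codes(demographics, locations)
unmatched = [row["school_id"] for row in joined if row["zip_code"] is None]
```

Keeping unmatched schools in the result, rather than dropping them silently, makes it easy to check how many schools could not be placed in a zip code before mapping.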
We found the choroplethr packages useful for mapping population by demographics to boroughs and zip codes on a choropleth map of NYC.
Choropleth maps allow us to compare data between geographical areas spatially. The borough maps were useful for getting a sense of where each race was concentrated at the borough level.
Observations from Borough Choropleth maps:
Choroplethr maps by zip code allowed us to identify clusters of zip codes where each race was concentrated.
Observations from Choropleth maps By Zipcodes:
We noticed that zip codes in NYC follow distinct patterns in each borough:
We can get a sense of racial concentration by borough by plotting the populations per race against the sorted list of Zip Codes.
Observations from Bar Plots Per Zipcode for Individual Races:
Next we compared the populations of the races across all the zip codes, to identify whether any one race was more concentrated than the others across the zip codes in NYC.
Observations from Stacked Bar Plots Per Zipcode:
Next we studied each neighborhood in NYC, using the neighborhoods listed at https://www.health.ny.gov/statistics/cancer/registry/appendix/neighborhoods.htm
The goal was to find neighborhoods where populations from one race dominated.
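One way to look for such neighborhoods is to sum zip-code populations up to the neighborhood level and check the share of the largest group. The sketch below is an assumption-laden illustration: the zip codes, neighborhood names, and counts are made up, and the 50% threshold for "dominated" is our own choice, not a definition taken from the study.

```python
from collections import defaultdict

# Made-up mapping and population counts, for illustration only.
zip_to_neighborhood = {
    "Z1": "Neighborhood A",
    "Z2": "Neighborhood A",
    "Z3": "Neighborhood B",
}
population = {
    "Z1": {"White": 1000, "Black": 4000, "Asian": 500, "Hispanic": 2500},
    "Z2": {"White": 500, "Black": 5000, "Asian": 500, "Hispanic": 2000},
    "Z3": {"White": 3000, "Black": 2500, "Asian": 2500, "Hispanic": 2000},
}


def dominant_races(zip_to_neighborhood, population, threshold=0.5):
    """Return {neighborhood: (race or None, top share)}.

    The race is reported only when its share of the neighborhood's
    population exceeds `threshold` (a hypothetical cutoff).
    """
    totals = defaultdict(lambda: defaultdict(int))
    for zip_code, counts in population.items():
        hood = zip_to_neighborhood[zip_code]
        for race, n in counts.items():
            totals[hood][race] += n

    result = {}
    for hood, counts in totals.items():
        total = sum(counts.values())
        race, n = max(counts.items(), key=lambda kv: kv[1])
        share = n / total
        result[hood] = (race if share > threshold else None, share)
    return result


dominance = dominant_races(zip_to_neighborhood, population)
```

With this sample data, Neighborhood A is flagged because one group exceeds half of its population, while Neighborhood B is not because its largest group holds only a plurality.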
Observations about Neighborhoods in Bronx:
Observations about Neighborhoods in Brooklyn:
Observations about Neighborhoods in Manhattan:
Observations about Neighborhoods in Queens:
Observations about Neighborhoods in Staten Island:
This simple bar chart shows the total number of students enrolled in each borough.
Brooklyn has the highest number of students enrolled over the 5 years of 2013 - 2018, and Staten Island appears to have the lowest of the 5 boroughs.
We extracted the few required variables to compute the percentage of enrolled students in NYC schools by race and plot its distribution.
The bar chart shows a higher concentration of Hispanic and Black students, a significant observation that affects a few of the observations made below as the analysis proceeds.
We observe the following patterns in the stacked bar chart above, which breaks down the ethnicities in NYC schools for each borough.
For each school we find the most common race among the enrolled students. In the chart below we compute the percentage of schools in each borough that have a particular race as the most common one. With this, we are trying to see whether the students of a particular race are concentrated in a few regions of the borough or distributed more evenly.
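The computation described here can be sketched roughly as below. The borough labels, race categories, and percentages are made up for illustration, and the record layout is an assumption rather than the structure of the actual enrollment dataset.

```python
from collections import Counter, defaultdict

# Made-up schools: each has a borough and the percentage of students per race.
schools = [
    {"borough": "Borough X", "pct": {"Asian": 5, "Black": 30, "Hispanic": 60, "White": 5}},
    {"borough": "Borough X", "pct": {"Asian": 10, "Black": 55, "Hispanic": 30, "White": 5}},
    {"borough": "Borough X", "pct": {"Asian": 5, "Black": 20, "Hispanic": 70, "White": 5}},
    {"borough": "Borough X", "pct": {"Asian": 15, "Black": 25, "Hispanic": 50, "White": 10}},
    {"borough": "Borough Y", "pct": {"Asian": 50, "Black": 5, "Hispanic": 25, "White": 20}},
    {"borough": "Borough Y", "pct": {"Asian": 20, "Black": 10, "Hispanic": 45, "White": 25}},
]


def most_common_race(pct):
    """Race with the largest share; ties go to the first one listed."""
    return max(pct, key=pct.get)


def school_share_by_top_race(schools):
    """Percentage of schools in each borough, grouped by most common race."""
    counts = defaultdict(Counter)
    for school in schools:
        counts[school["borough"]][most_common_race(school["pct"])] += 1
    return {
        borough: {race: 100 * n / sum(c.values()) for race, n in c.items()}
        for borough, c in counts.items()
    }


shares = school_share_by_top_race(schools)
```

Note that this counts schools, not students, so a borough with many small schools dominated by one race can look different here than in an enrollment-weighted chart.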
In the next two plots we show how the segregation score changes over the 5 years of data used in the project.
For this we defined a metric called seggregation_score to capture the level of segregation of each school in New York City. It allows us to compare schools with different distributions of student ethnicities.
seggregation_score = prop_common - mean(prop_others)
where prop_common is the proportion of the most common race in the school and prop_others are the proportions of the other races.
This can be simplified. The simplification assumes that proportions are expressed as percentages and that there are five race categories: the other four proportions then sum to 100 - prop_common, so their mean is 25 - (0.25 * prop_common), which gives:
seggregation_score = (1.25 * prop_common) - 25
We know that this only considers the most common ethnicity in each school and hence does not differentiate schools based on the proportions of the other ethnicities.
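A minimal sketch of the metric follows, to check that the two forms agree and to illustrate the limitation just mentioned. It assumes five race categories with shares on a 0-100 scale; the category names (including "Other") and the two schools are made up for illustration.

```python
from statistics import mean


def seggregation_score(percentages):
    """Return prop_common - mean(prop_others) for one school.

    `percentages` maps each race category to its share of enrollment
    on a 0-100 scale.
    """
    shares = sorted(percentages.values(), reverse=True)
    prop_common, prop_others = shares[0], shares[1:]
    return prop_common - mean(prop_others)


def simplified_score(prop_common):
    """Closed form; valid only for five categories whose shares sum to 100."""
    return 1.25 * prop_common - 25


# Two made-up schools with the same most common share but different mixes.
school_a = {"Asian": 10, "Black": 20, "Hispanic": 60, "White": 8, "Other": 2}
school_b = {"Asian": 0, "Black": 40, "Hispanic": 60, "White": 0, "Other": 0}

score_a = seggregation_score(school_a)
score_b = seggregation_score(school_b)
```

Both schools receive the same score even though school_b has only two groups present, which is exactly the blind spot of a metric that depends on prop_common alone. Under the same assumptions, the score ranges from 0 for a perfectly even school (20% each) to 100 for a school with a single race.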
The Choropleth map below shows the racial segregation of NYC schools by zip code for the 5 years of data.
Zip codes with the highest segregation scores appear to be clustered in a few regions of the map above.
All ethnicities appear to be highly clustered in the map above, which could be due to the underlying population being clustered along racial lines. It is very surprising to see such contiguous clusters with the same race being the most common one.
The following two plots show the consistency between NYC school enrollment and the underlying population, which was plotted in the first section of this project under the name Distribution of Races by Zip Code. The second plot shows school enrollment for 2017 specifically, and indicates that highly populated areas have more students enrolled in schools.
Poverty is worth considering when talking about school diversity.
We observe Black and Hispanic students are much more likely to attend a school where more than 75% of students experience poverty.
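A comparison like this can be computed by weighting each school by the number of students of each race it enrolls. The sketch below uses made-up schools and counts, and the field names are hypothetical; only the 75% cutoff comes from the text above.

```python
from collections import defaultdict

# Made-up schools: poverty rate (percent) and enrollment counts per race.
poverty_schools = [
    {"poverty_pct": 90, "enrollment": {"Black": 200, "Hispanic": 250, "White": 20, "Asian": 30}},
    {"poverty_pct": 40, "enrollment": {"Black": 50, "Hispanic": 50, "White": 300, "Asian": 100}},
]


def high_poverty_share(schools, threshold=75):
    """Fraction of students of each race enrolled in schools where more
    than `threshold` percent of students experience poverty."""
    total = defaultdict(int)
    in_high_poverty = defaultdict(int)
    for school in schools:
        for race, n in school["enrollment"].items():
            total[race] += n
            if school["poverty_pct"] > threshold:
                in_high_poverty[race] += n
    return {race: in_high_poverty[race] / total[race] for race in total}


share = high_poverty_share(poverty_schools)
```

Working from student counts rather than school-level percentages keeps large and small schools in proportion, so the result reads directly as "the fraction of students of this race who attend a high-poverty school".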
The following plot shows a pattern similar to the one above: we observe a higher economic need index for Black and Hispanic students in NYC over the 5 years of data.
https://dlab.berkeley.edu/sites/default/files/training_materials/Census_Lecture_030915.pdf
https://rdrr.io/cran/choroplethr/man/county_choropleth_acs.html
https://arilamstein.com/creating-zip-code-choropleths-choroplethrzip/
https://www.trulia.com/blog/tech/the-choroplethr-package-for-r/